Generalizing Syntactic Collocates for Creative Language Generation

Author

David Hardcastle

Abstract

This paper presents the construction of a data source that supports the automatic generation of cryptic crossword clues in a system called ENIGMA. Cryptic crossword clues have two layers of meaning: a surface reading that appears to be a fragment of English prose, and a puzzle reading that the solver must uncover to solve the clue. The content expressed by the clue, which is also the input to the generation process, is a wordplay puzzle such as an anagram. In expressing this puzzle, ENIGMA must choose language creatively so that a separate surface reading of the text is also generated, in effect translating a semantic input via a layered text into a new semantic output. To ensure that this surface text is meaningful, ENIGMA uses corpus data to determine which words can be combined meaningfully and which cannot.

Similar resources

Rule-Based Extraction of English Verb Collocates from a Dependency-Parsed Corpus

We report on a rule-based procedure of extracting and labeling English verb collocates from a dependency-parsed corpus. Instead of relying on the syntactic labels provided by the parser, we use a simple topological sequence that we fill with the extracted collocates in a prescribed order. A more accurate syntactic labeling will be obtained from the topological fields by comparison of correspond...

Comparison of right hemisphere damage patients and normal adults in some linguistic performances

Introduction: According to some evidence, damage to the right hemisphere leads to impaired linguistic and cognitive functions. Patients with right hemisphere damage (RHD) experience difficulties at different levels of language. Assessing and diagnosing language disorders in RHD patients help to plan treatment programs. Therefore, the present study investigated some of the language functions in ...

Syntactic realization with data-driven neural tree grammars

A key component in surface realization in natural language generation is to choose concrete syntactic relationships to express a target meaning. We develop a new method for syntactic choice based on learning a stochastic tree grammar in a neural architecture. This framework can exploit state-of-the-art methods for modeling word sequences and generalizing across vocabulary. We also induce embedd...

Building a Collocational Semantic Lexicon

Natural Language Generation (NLG) systems require access to collocational information to help determine lexical choices constrained both by syntactic and semantic concerns. Constructing linguistic resources to support these decisions can be time-consuming whereas, if the information is extracted automatically, data sparsity limits the variety of the output. This paper reports on a method for ex...

Generation of Word Profiles on the basis of a large and balanced German corpus

Electronic corpora have been used in lexicography and the domain of language learning for more than two decades (cf. Braun et al. 2006, Sinclair 1991). Traditionally, computer platforms exploiting these corpora were based on concordances that present a word in its different contexts. However, concordances hit their limits for very large corpora where the result sets are generally too large for ...

Publication date: 2007
